-
Notifications
You must be signed in to change notification settings - Fork 0
Dual Storage Architecture #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dual-storage-baseline
Are you sure you want to change the base?
Conversation
* Dual writer: mode 3 * Add integration tests for playlits in mode 3 * Remove todo * Update pkg/apiserver/rest/dualwriter_mode3.go Co-authored-by: Arati R. <[email protected]> * Admin: Fixes an issue where user accounts could not be enabled (#88117) Fix: unable to enable user * [REVIEW] FInish mode 3 and add tests * Improve logging * Update dependencies * Update pkg/apiserver/rest/dualwriter_mode3_test.go Co-authored-by: maicon <[email protected]> * remove test assertion * Use mode log when dual writer is initiated --------- Co-authored-by: Arati R. <[email protected]> Co-authored-by: gonvee <[email protected]> Co-authored-by: maicon <[email protected]>
|
cursor review |
| created, err := d.Storage.Create(ctx, obj, createValidation, options) | ||
| if err != nil { | ||
| log.Error(err, "unable to create object in storage") | ||
| d.recordLegacyDuration(true, mode3Str, options.Kind, method, startStorage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Broken Storage Performance Monitoring
When Storage.Create fails, the code incorrectly calls recordLegacyDuration instead of recordStorageDuration. This records the storage operation's duration under the wrong metric, making it impossible to accurately monitor storage performance and distinguish between legacy and storage failures.
| d.recordStorageDuration(true, mode3Str, options.Kind, method, startStorage) | ||
| return res, async, err | ||
| } | ||
| d.recordStorageDuration(false, mode3Str, name, method, startStorage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Incorrect Metric Labels Break Monitoring
In the success path of Delete, the code passes name (the object name) instead of options.Kind as the third parameter to recordStorageDuration. This causes metrics to be labeled with object names rather than resource kinds, breaking metric aggregation and making monitoring ineffective.
| upstream: objInfo, | ||
| updated: updated, | ||
| log.Error(err, "unable to update in storage") | ||
| d.recordLegacyDuration(true, mode3Str, options.Kind, method, startStorage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Metric Mix-up Corrupts Storage Observability
When Storage.Update fails, the code incorrectly calls recordLegacyDuration instead of recordStorageDuration. This records the storage operation's duration under the wrong metric, corrupting observability data and preventing accurate monitoring of storage vs legacy performance.
| ctx, cancel := context.WithTimeoutCause(ctx, time.Second*10, errors.New("legacy deletecollection timeout")) | ||
| defer cancel() | ||
| _, err := d.Legacy.DeleteCollection(ctx, deleteValidation, options, listOptions) | ||
| d.recordStorageDuration(err != nil, mode3Str, options.Kind, method, startLegacy) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Legacy Metrics Pollute Storage Data
In the DeleteCollection goroutine that executes the legacy operation, the code calls recordStorageDuration instead of recordLegacyDuration. This misattributes legacy operation metrics to storage, making it impossible to accurately monitor and compare the performance of legacy vs storage systems.
| func (d *DualWriterMode3) Delete(ctx context.Context, name string, deleteValidation rest.ValidateObjectFunc, options *metav1.DeleteOptions) (runtime.Object, bool, error) { | ||
| var method = "delete" | ||
| log := d.Log.WithValues("name", name, "kind", options.Kind, "method", method) | ||
| ctx = klog.NewContext(ctx, d.Log) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Delete Operations Drop Critical Logging Context
The Delete function passes d.Log to klog.NewContext instead of the enriched log variable that includes name, kind, and method values. This causes the context to lose important logging metadata, making it harder to trace and debug delete operations.
Test 6
Replicated from ai-code-review-evaluation/grafana-cursor#6
Note
Implements DualWriter Mode3 (read from Storage, write to both) with structured logging, metrics, and time-bounded async legacy operations; adds comprehensive unit and integration tests (incl. playlist) for Mode3.
pkg/apiserver/rest/dualwriter_mode3.go):Listto read fromStorageand write to bothStorageandLegacy.context.WithTimeoutCause.kind,method, etc.) and record per-operation durations viadualWriterMetrics.GetSingularName,NamespaceScoped,New,NewList,ConvertToTable).Create,Get,List,Delete,DeleteCollection,Update(pkg/apiserver/rest/dualwriter_mode3_test.go).file,unified, and (skipped)etcdbackends (pkg/tests/apis/playlist/playlist_test.go).Written by Cursor Bugbot for commit 67b74e1. Configure here.